Abstract This note serves two purposes. Firstly, we construct a counterexample to show that the statement on the convergence of the alternating direction method of multipliers (ADMM) for solving linearly constrained convex optimization problems in a highly influential paper by Boyd et al. [Found. Trends Mach. Learn. 3(1) 1-122 (2011)] can be false if no prior condition on the existence of solutions to all the subproblems involved is assumed to hold. Secondly, we present fairly mild conditions to guarantee the existence of solutions to all the subproblems and provide a rigorous convergence analysis on the ADMM, under a more general and useful semi-proximal ADMM (sPADMM) setting considered by Fazel et al. [SIAM J. Matrix Anal. Appl. 34(3) 946-977 (2013)], with a computationally more attractive large step-length that can even exceed the practically much preferred golden ratio of $(1+\sqrt{5})/2$.
1 Introduction

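Let $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ be three finite-dimensional real Euclidean spaces each endowed with an inner product $\langle\cdot,\cdot\rangle$ and its induced norm $\|\cdot\|$. Let $f:\mathcal{Y}\to(-\infty,+\infty]$ and $g:\mathcal{Z}\to(-\infty,+\infty]$ be two closed proper convex functions and $\mathcal{A}:\mathcal{X}\to\mathcal{Y}$ and $\mathcal{B}:\mathcal{X}\to\mathcal{Z}$ be two linear maps. Consider the following 2-block separable convex optimization problem:

\[ \min_{y\in\mathcal{Y},\,z\in\mathcal{Z}} \{ f(y)+g(z) \ \ \text{s.t.}\ \ \mathcal{A}^*y+\mathcal{B}^*z=c \}, \tag{1} \]

where $c\in\mathcal{X}$ is the given data and the linear maps $\mathcal{A}^*$ and $\mathcal{B}^*$ are the adjoints of $\mathcal{A}$ and $\mathcal{B}$, respectively. The effective domains of $f$ and $g$ are denoted by $\operatorname{dom} f$ and $\operatorname{dom} g$, respectively.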

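Let $\sigma>0$ be a given penalty parameter. The augmented Lagrangian function of problem (1) is defined by, for any $(x,y,z)\in\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$,

\[ \mathcal{L}_\sigma(y,z;x) := f(y)+g(z)+\langle x,\mathcal{A}^*y+\mathcal{B}^*z-c\rangle+\frac{\sigma}{2}\|\mathcal{A}^*y+\mathcal{B}^*z-c\|^2. \tag{2} \]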
Choose an initial point $(x^0,y^0,z^0)\in\mathcal{X}\times\operatorname{dom} f\times\operatorname{dom} g$ and a step-length $\tau\in(0,+\infty)$. The classical alternating direction method of multipliers (ADMM) of Glowinski and Marroco [10] and Gabay and Mercier [7] then takes the following scheme for $k=0,1,\ldots$,

\[ \begin{cases} y^{k+1}=\arg\min_{y}\mathcal{L}_\sigma(y,z^k;x^k),\\ z^{k+1}=\arg\min_{z}\mathcal{L}_\sigma(y^{k+1},z;x^k),\\ x^{k+1}=x^k+\tau\sigma(\mathcal{A}^*y^{k+1}+\mathcal{B}^*z^{k+1}-c). \end{cases} \tag{3} \]

The convergence analysis for the ADMM scheme (3) under certain settings was first conducted by Gabay and Mercier [7], Glowinski [8] and Fortin and Glowinski [6]. One may refer to [1] and [4] for recent surveys on this topic and to [9] for a note on the historical development of the ADMM.

In a highly influential paper written by Boyd et al. [1], it was asserted [Section 3.2.1, Page 17] that if $f$ and $g$ are closed proper convex functions [1, Assumption 1] and the Lagrangian function of problem (1) has a saddle point [1, Assumption 2], then the ADMM scheme (3) converges for $\tau=1$. This, however, turns out to be false without imposing the prior condition that all the subproblems involved have solutions. To demonstrate our claim, in this note we shall provide a simple example (see Section 3) with the following four nice properties:

(P1) Both $f$ and $g$ are closed proper convex functions;

(P2) The Lagrangian function has infinitely many saddle points;

(P3) Slater's constraint qualification (CQ) holds; and

(P4) The linear operator $\mathcal{B}$ is nonsingular.

Note that our example to be constructed satisfies the two assumptions made in [1], i.e., (P1) and (P2), and the two additional favorable properties (P3) and (P4). Yet, the ADMM scheme (3) even with $\tau=1$ may not be well-defined for solving problem (1). A closer examination of the proofs given in [1] reveals that the authors mistakenly took for granted the existence of solutions to all the subproblems in (3) under (P1) and (P2) only. Here we will fix this gap by presenting fairly mild conditions to guarantee the existence of solutions to all the subproblems in (3). Moreover, in order to deal with the potential non-solvability of the subproblems in the ADMM scheme (3), we shall analyze the convergence of the ADMM under a more useful semi-proximal ADMM (sPADMM) setting advocated by Fazel et al. [5], with a computationally more attractive large step-length that can even be bigger than the golden ratio of $(1+\sqrt{5})/2$.

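Let $\mathcal{S}:\mathcal{Y}\to\mathcal{Y}$ and $\mathcal{T}:\mathcal{Z}\to\mathcal{Z}$ be two self-adjoint positive semidefinite linear operators. Then the sPADMM takes the following iteration scheme for $k=0,1,\ldots$,

\[ \begin{cases} y^{k+1}=\arg\min_{y}\{\mathcal{L}_\sigma(y,z^k;x^k)+\frac12\|y-y^k\|_{\mathcal{S}}^2\},\\ z^{k+1}=\arg\min_{z}\{\mathcal{L}_\sigma(y^{k+1},z;x^k)+\frac12\|z-z^k\|_{\mathcal{T}}^2\},\\ x^{k+1}=x^k+\tau\sigma(\mathcal{A}^*y^{k+1}+\mathcal{B}^*z^{k+1}-c). \end{cases} \tag{4} \]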

The sPADMM scheme (4) with $\mathcal{S}=0$ and $\mathcal{T}=0$ is nothing but the ADMM scheme (3) and the case $\mathcal{S}\succ0$ and $\mathcal{T}\succ0$ was initiated by Eckstein [3]. Most recent studies have shown that the sPADMM, a seemingly mild extension of the classical ADMM, turns out to play a pivotal role in solving multi-block convex composite conic programming problems [2,12,15] with a low to medium accuracy. For more details on choosing $\mathcal{S}$ and $\mathcal{T}$, one may refer to the recent Ph.D. thesis of Li [11].
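To make scheme (4) concrete, the following minimal Python sketch runs the sPADMM on a hypothetical scalar instance that is not taken from this note: $f(y)=\frac12(y-a)^2$, $g(z)=\frac12(z-b)^2$, $\mathcal{A}^*=1$, $\mathcal{B}^*=-1$, $c=0$, with $\mathcal{S}=s$ and $\mathcal{T}=t$ scalars. The function name, the data and the parameter values are assumptions, chosen only so that both subproblems have closed-form solutions.

```python
def spadmm_scalar(a, b, sigma=1.0, tau=1.6, s=0.5, t=0.5, iters=200):
    """Run scheme (4) on: min 0.5*(y-a)^2 + 0.5*(z-b)^2  s.t.  y - z = 0.

    Hypothetical toy instance (A* = 1, B* = -1, c = 0, S = s, T = t).
    Returns the final multiplier x and the primal iterates y, z.
    """
    x = y = z = 0.0
    for _ in range(iters):
        # y-step: set the derivative of L_sigma(y, z; x) + (s/2)(y - y_k)^2 to zero
        y = (a - x + sigma * z + s * y) / (1.0 + sigma + s)
        # z-step: same for L_sigma(y_new, z; x) + (t/2)(z - z_k)^2
        z = (b + x + sigma * y + t * z) / (1.0 + sigma + t)
        # multiplier step with step-length tau
        x = x + tau * sigma * (y - z)
    return x, y, z


if __name__ == "__main__":
    x, y, z = spadmm_scalar(3.0, 1.0)
    # For this instance the KKT point is y = z = (a + b)/2 and x = (a - b)/2.
    print(x, y, z)
```

Setting `s = t = 0` recovers the ADMM scheme (3) on the same instance; here every subproblem is strongly convex, so solvability is never in question.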

The remaining parts of this note are organized as follows. In Section 2, we first present some necessary preliminary results from convex analysis for later discussions and then provide conditions under which the subproblems in the sPADMM scheme (4) are solvable, or even admit bounded solution sets, so that this scheme is well-defined. In Section 3, based on several results established in Section 2, we construct a counterexample that satisfies (P1)-(P4) to show that the conclusion on the convergence of ADMM scheme (3) in [1, Section 3.2.1] can be false without making further assumptions. In Section 4, we establish some satisfactory convergence properties for the sPADMM scheme (4) with a computationally more attractive large step-length that can even exceed the golden ratio of $(1+\sqrt{5})/2$, under fairly weak assumptions. We conclude this note in Section 5.
2 Preliminaries


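Let $\mathcal{U}$ be a finite dimensional real Euclidean space endowed with an inner product $\langle\cdot,\cdot\rangle$ and its induced norm $\|\cdot\|$. Let $\mathcal{O}:\mathcal{U}\to\mathcal{U}$ be any self-adjoint positive semidefinite linear operator. For any $u,u'\in\mathcal{U}$, define $\langle u,u'\rangle_{\mathcal{O}}:=\langle u,\mathcal{O}u'\rangle$ and $\|u\|_{\mathcal{O}}:=\sqrt{\langle u,\mathcal{O}u\rangle}$ so that

\[ \langle u,u'\rangle_{\mathcal{O}}=\frac12\left(\|u\|_{\mathcal{O}}^2+\|u'\|_{\mathcal{O}}^2-\|u-u'\|_{\mathcal{O}}^2\right)=\frac12\left(\|u+u'\|_{\mathcal{O}}^2-\|u\|_{\mathcal{O}}^2-\|u'\|_{\mathcal{O}}^2\right). \tag{5} \]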


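For any given set $U\subseteq\mathcal{U}$, we denote its relative interior by $\operatorname{ri}(U)$ and define its indicator function $\delta_U:\mathcal{U}\to(-\infty,+\infty]$ by

\[ \delta_U(u):=\begin{cases}0, & \text{if } u\in U,\\ +\infty, & \text{if } u\notin U.\end{cases} \]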


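Let $\theta:\mathcal{U}\to(-\infty,+\infty]$ be a closed proper convex function. We use $\operatorname{dom}\theta$ and $\operatorname{epi}(\theta)$ to denote its effective domain and its epigraph, respectively. Moreover, we use $\partial\theta(\cdot)$ to denote the subdifferential mapping [13, Section 23] of $\theta(\cdot)$, which is defined by

\[ \partial\theta(u):=\{v\in\mathcal{U}\mid\theta(u')\ge\theta(u)+\langle v,u'-u\rangle\ \ \forall\,u'\in\mathcal{U}\},\quad\forall\,u\in\mathcal{U}. \tag{6} \]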


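It holds that there exists a self-adjoint positive semidefinite linear operator $\Sigma_\theta:\mathcal{U}\to\mathcal{U}$ such that for any $u,u'\in\mathcal{U}$ with $v\in\partial\theta(u)$ and $v'\in\partial\theta(u')$,

\[ \langle v-v',u-u'\rangle\ge\|u-u'\|_{\Sigma_\theta}^2. \tag{7} \]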

Since $\theta$ is closed, proper and convex, by [13, Theorem 8.5] we know that the recession function [13, Section 8] of $\theta$, denoted by $\theta0^+$, is a positively homogeneous closed proper convex function that can be written as, for an arbitrary $u'\in\operatorname{dom}\theta$,

\[ \theta0^+(u)=\lim_{\rho\to+\infty}\frac{\theta(u'+\rho u)-\theta(u')}{\rho},\quad\forall\,u\in\mathcal{U}. \]


The Fenchel conjugate $\theta^*(\cdot)$ of $\theta$ is a closed proper convex function defined by

\[ \theta^*(v):=\sup_{u\in\mathcal{U}}\{\langle u,v\rangle-\theta(u)\},\quad\forall\,v\in\mathcal{U}. \]


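Since $\theta$ is closed, by [13, Theorem 23.5] we know that

\[ v\in\partial\theta(u)\ \Longleftrightarrow\ u\in\partial\theta^*(v). \tag{8} \]

The dual of problem (1) takes the form of

\[ \max_{x\in\mathcal{X}}\{h(x):=-f^*(-\mathcal{A}x)-g^*(-\mathcal{B}x)-\langle c,x\rangle\}. \tag{9} \]

The Lagrangian function of problem (1) is defined by

\[ \mathcal{L}(y,z;x):=f(y)+g(z)+\langle x,\mathcal{A}^*y+\mathcal{B}^*z-c\rangle,\quad\forall\,(y,z,x)\in\mathcal{Y}\times\mathcal{Z}\times\mathcal{X}, \tag{10} \]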


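which is convex in $(y,z)\in\mathcal{Y}\times\mathcal{Z}$ and concave in $x\in\mathcal{X}$. Recall that we say the Slater's CQ for problem (1) holds if

\[ \{(y,z)\mid y\in\operatorname{ri}(\operatorname{dom} f),\ z\in\operatorname{ri}(\operatorname{dom} g),\ \mathcal{A}^*y+\mathcal{B}^*z=c\}\neq\emptyset. \]

Under the above Slater's CQ, from [13, Corollaries 28.2.2 & 28.3.1] we know that $(\bar y,\bar z)\in\operatorname{dom} f\times\operatorname{dom} g$ is a solution to problem (1) if and only if there exists a Lagrangian multiplier $\bar x\in\mathcal{X}$ such that $(\bar x,\bar y,\bar z)$ is a saddle point to the Lagrangian function (10), or, equivalently, $(\bar x,\bar y,\bar z)$ is a solution to the following Karush-Kuhn-Tucker (KKT) system

\[ -\mathcal{A}x\in\partial f(y),\quad -\mathcal{B}x\in\partial g(z)\quad\text{and}\quad\mathcal{A}^*y+\mathcal{B}^*z=c. \tag{11} \]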


Furthermore, if the solution set to the KKT system (11) is nonempty, by [13, Theorem 30.4 & Corollary 30.5.1] we know that a vector $(\bar x,\bar y,\bar z)\in\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$ is a solution to (11) if and only if $(\bar y,\bar z)$ is an optimal solution to problem (1) and $\bar x$ is an optimal solution to problem (9).

In the following, we shall conduct discussions on the existence of solutions to the subproblems in the sPADMM scheme (4). Let the augmented Lagrangian function $\mathcal{L}_\sigma$ be defined by (2) and $\mathcal{S}$ and $\mathcal{T}$ be two self-adjoint positive semidefinite linear operators used in the sPADMM scheme (4). Let $(x',y',z')\in\mathcal{X}\times\operatorname{dom} f\times\operatorname{dom} g$ be an arbitrarily given point. Consider the following two auxiliary optimization problems:

\[ \min_{y\in\mathcal{Y}}\left\{F(y):=\mathcal{L}_\sigma(y,z';x')+\frac12\|y-y'\|_{\mathcal{S}}^2\right\} \tag{12} \]

and

\[ \min_{z\in\mathcal{Z}}\left\{G(z):=\mathcal{L}_\sigma(y',z;x')+\frac12\|z-z'\|_{\mathcal{T}}^2\right\}. \tag{13} \]

Note that since $z'\in\operatorname{dom} g$, problem (12) is equivalent to

\[ \min_{y\in\mathcal{Y}}\left\{F(y):=f(y)+\frac{\sigma}{2}\|\mathcal{A}^*y+(\mathcal{B}^*z'-c+x'/\sigma)\|^2+\frac12\|y-y'\|_{\mathcal{S}}^2\right\}. \tag{14} \]

We now study under what conditions problems (12) and (13) are solvable or have bounded solution sets. For this purpose, we consider the following assumptions:

Assumption 1 $f0^+(y)>0$ for any $y\in\mathcal{M}$, where

\[ \mathcal{M}:=\{y\in\mathcal{Y}\mid\mathcal{A}^*y=0,\ \mathcal{S}y=0\}\setminus\{y\in\mathcal{Y}\mid f0^+(-y)=-f0^+(y)=0\}. \]

Assumption 2 $g0^+(z)>0$ for any $z\in\mathcal{N}$, where

\[ \mathcal{N}:=\{z\in\mathcal{Z}\mid\mathcal{B}^*z=0,\ \mathcal{T}z=0\}\setminus\{z\in\mathcal{Z}\mid g0^+(-z)=-g0^+(z)=0\}. \]

Assumption 3 $f0^+(y)>0$ for any $0\neq y\in\{y\in\mathcal{Y}\mid\mathcal{A}^*y=0,\ \mathcal{S}y=0\}$.

Assumption 4 $g0^+(z)>0$ for any $0\neq z\in\{z\in\mathcal{Z}\mid\mathcal{B}^*z=0,\ \mathcal{T}z=0\}$.

Note that Assumptions 1-4 are not very restrictive. For example, if both $f$ and $g$ are coercive, in particular if they are norm functions, all the four assumptions hold automatically without any other conditions. Under the above assumptions, we have the following results.

Proposition 2.1 It holds that 

(a) Problem (12) is solvable if Assumption 1 holds, and problem (13) is solvable if Assumption 2 holds. 

(b) The solution set to problem (12) is nonempty and bounded if and only if Assumption 3 holds, and 
the solution set to problem (13) is nonempty and bounded if and only if Assumption 4 holds. 

Proof (a) We first show that when Assumption 1 holds, the solution set to problem (12) is not empty. Consider the recession function $F0^+$ of $F$. On the one hand, by using [13, Theorem 9.3] and the second example given in [13, Pages 67-68], we know that for any $y\in\mathcal{Y}$ such that $\mathcal{A}^*y\neq0$ or $\mathcal{S}y\neq0$, one must have $F0^+(y)=+\infty$. On the other hand, for any $y\in\mathcal{Y}$ such that $\mathcal{A}^*y=0$ and $\mathcal{S}y=0$, by the definition of $F(y)$ in (14) we have

\[ F0^+(y)=f0^+(y)+\langle\sigma\mathcal{A}(\mathcal{B}^*z'-c+x'/\sigma)-\mathcal{S}y',y\rangle=f0^+(y). \]

Hence, by Assumption 1 we know that $F0^+(y)>0$ for all $y\in\mathcal{Y}$ except for those satisfying $F0^+(-y)=-F0^+(y)=0$. Then, from [13, (b) in Corollary 13.3.4], it holds that $0\in\operatorname{ri}(\operatorname{dom} F^*)$. Furthermore, by [13, Theorem 23.4] we know that $\partial F^*(0)$ is a nonempty set, i.e., there exists a $\bar y\in\mathcal{Y}$ such that $\bar y\in\partial F^*(0)$. By noting that $F$ is closed and using (8), we then have $0\in\partial F(\bar y)$, which implies that $\bar y$ is a solution to problem (14) and hence to problem (12).

By repeating the above discussions we know that problem (13) is also solvable if Assumption 2 holds. 

(b) Note that problem (14) is equivalent to problem (12). By reorganizing the proofs for part (a), we can see that Assumption 3 holds if and only if $F0^+(y)>0$ for all $0\neq y\in\mathcal{Y}$. As a result, if Assumption 3 holds, from [13, Theorem 27.2] we know that problem (14) has a nonempty and bounded solution set. Conversely, if the solution set to problem (14) is nonempty and bounded, by [13, Corollary 8.7.1] we know that there does not exist any $0\neq y\in\mathcal{Y}$ such that $F0^+(y)\le0$, so that Assumption 3 holds. Similarly, we can prove the remaining results of part (b). This completes the proof of the proposition. □

Based on Proposition 2.1 and its proof, we have the following results. 

Corollary 2.1 If problem (1) has a nonempty and bounded solution set, then both problems (12) and 
(13) have nonempty and bounded solution sets. 
Proof Since problem (1) has a nonempty and bounded solution set, there does not exist any $0\neq y\in\mathcal{Y}$ with $\mathcal{A}^*y=0$ such that $f0^+(y)\le0$, or $0\neq z\in\mathcal{Z}$ with $\mathcal{B}^*z=0$ such that $g0^+(z)\le0$. Thus, Assumptions 3 and 4 hold. Then, by part (b) in Proposition 2.1 we know that the conclusion of Corollary 2.1 holds. □

Proposition 2.2 If $f$ (or $g$) is a closed proper piecewise linear-quadratic convex function [14, Definition 10.20], especially a polyhedral convex function, we can replace the “$>$” in Assumption 1 (or 2) by “$\ge$” and the corresponding sufficient condition in part (a) of Proposition 2.1 is also necessary.

Proof Note that when $f$ is a closed piecewise linear-quadratic convex function, the function $F$ defined in (14) is a piecewise linear-quadratic convex function with $\operatorname{dom} F=\operatorname{dom} f$ being a closed convex polyhedral set. Then by [14, Theorem 11.14(b)] we know that $F^*$ is also a piecewise linear-quadratic convex function whose effective domain is a closed convex polyhedral set. By repeating the discussions for part (a) of Proposition 2.1 and using [13, Corollary 13.3.4, (a)] we can obtain that Assumption 1 with “$>$” being replaced by “$\ge$” holds if and only if $0\in\operatorname{dom} F^*$, or $\partial F^*(0)$ is a nonempty set [14, Proposition 10.21], which is equivalent to the fact that $\operatorname{argmin} F$ is a nonempty set. If $g$ is piecewise linear-quadratic we can get a similar result. □

Finally, we need the following easy-to-verify result on the convergence of quasi-Fejér monotone sequences.

Lemma 2.1 Let $\{a_k\}_{k\ge0}$ be a nonnegative sequence of real numbers satisfying $a_{k+1}\le a_k+\varepsilon_k$ for all $k\ge0$, where $\{\varepsilon_k\}_{k\ge0}$ is a nonnegative and summable sequence of real numbers. Then the quasi-Fejér monotone sequence $\{a_k\}$ converges to a unique limit point.
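As a small numerical illustration of Lemma 2.1 (not part of the original argument), the Python sketch below builds the extreme case $a_{k+1}=a_k+\varepsilon_k$ with the assumed choice $\varepsilon_k=1/(k+1)^2$, which is summable with sum $\pi^2/6$. The helper names are hypothetical.

```python
import math


def satisfies_quasi_fejer(a, eps):
    """Check a[k+1] <= a[k] + eps[k] with a, eps nonnegative (finite prefix only)."""
    nonneg = all(v >= 0 for v in a) and all(e >= 0 for e in eps)
    return nonneg and all(a[k + 1] <= a[k] + eps[k] for k in range(len(eps)))


def extreme_sequence(a0, n):
    """Build a_{k+1} = a_k + eps_k with eps_k = 1/(k+1)^2 (an assumed example)."""
    eps = [1.0 / (k + 1) ** 2 for k in range(n)]
    a = [a0]
    for e in eps:
        a.append(a[-1] + e)
    return a, eps


if __name__ == "__main__":
    a, eps = extreme_sequence(1.0, 100000)
    print(satisfies_quasi_fejer(a, eps), a[-1], 1.0 + math.pi ** 2 / 6)
```

Even though this sequence increases at every step, the summability of $\{\varepsilon_k\}$ forces it to settle at $a_0+\pi^2/6$.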


3 A Counterexample 


In this section, we shall provide an example that satisfies all the properties (P1)-(P4) stated in Section 1 to show that the solution set to a certain subproblem in the ADMM scheme (3) can be empty if no further assumptions on $f$, $g$ or $\mathcal{A}$ are made. This means that the convergence analysis for the ADMM stated in [1] can be false. The construction of this example relies on Proposition 2.1. The parameter $\sigma$ and the initial point $(x^0,y^0,z^0)$ in the counterexample are just selected for the convenience of computations and one can construct similar examples for arbitrary penalty parameters and initial points.

We now present this example, which is a 3-dimensional 2-block convex optimization problem. 

Example 3.1 Let $\delta_{\ge0}(\cdot)$ be the indicator function of the nonnegative real numbers. Consider problem (1) with $f(y_1,y_2):=\max(e^{-y_1}+y_2,\,y_2^2)$, $g(z):=\delta_{\ge0}(z)$, $\mathcal{A}^*=(0,1)$, $\mathcal{B}^*=-1$, and $c=2$, i.e.,

\[ \min_{(y_1,y_2,z)\in\Re^3}\left\{\max(e^{-y_1}+y_2,\,y_2^2)+\delta_{\ge0}(z)\ \middle|\ 0y_1+y_2-z=2\right\}. \tag{15} \]

In this example, $f$ and $g$ are closed proper convex functions with $\operatorname{ri}(\operatorname{dom} f)=\operatorname{dom} f=\Re^2$ and $\operatorname{ri}(\operatorname{dom} g)=\{z\mid z>0\}\subset\operatorname{dom} g$. The vector $(0,3,1)\in\Re^3$ lies in $\operatorname{ri}(\operatorname{dom} f)\times\operatorname{ri}(\operatorname{dom} g)$ and satisfies the constraint in problem (15). Hence, for problem (15), the Slater CQ holds. It is easy to check that the optimal solution set to problem (15) is given by

\[ \{(y_1,y_2,z)\in\Re^3\mid y_1\ge-\log_e 2,\ y_2=2,\ z=0\} \]


and the corresponding optimal objective value is 4. The Lagrangian function of problem (15) is given by

\[ \mathcal{L}(y_1,y_2,z;x)=\max(e^{-y_1}+y_2,\,y_2^2)+\delta_{\ge0}(z)+x(y_2-z-2),\quad\forall\,(y_1,y_2,z,x)\in\Re^4. \]

We now compute the dual of problem (15) based on this Lagrangian function. 

Lemma 3.1 The objective function of the dual of problem (15) is given by

\[ h(x)=\begin{cases}-x^2/4-2x, & \text{if } x\in(-\infty,-2),\\ 1-x, & \text{if } x\in[-2,-1),\\ -2x, & \text{if } x\in[-1,0],\\ -\infty, & \text{if } x\in(0,+\infty).\end{cases} \]






Liang Chen et al. 



Fig. 1 Graphs of the dual objective function $h(x)$ (left) and the function $\ell(y_2)$ (right).


Proof By the definition of the dual objective function, we have

\[ \begin{aligned} h(x)&=\inf_{y_1,y_2,z}\mathcal{L}(y_1,y_2,z;x)\\ &=\inf_{z\ge0,\,y_2}\left\{\inf_{y_1}\left(\max(e^{-y_1}+y_2,\,y_2^2)+(y_2-z-2)x\right)\right\}\\ &=\inf_{z\ge0,\,y_2}\left\{\max(y_2,\,y_2^2)+y_2x-zx-2x\right\}\\ &=\min\left(\inf_{y_2\in[0,1],\,z\ge0}\{y_2+y_2x-zx-2x\},\ \inf_{y_2\notin[0,1],\,z\ge0}\{y_2^2+y_2x-zx-2x\}\right). \end{aligned} \]


For any given $x\in\Re$, we have

\[ \inf_{y_2\in[0,1],\,z\ge0}\{y_2+y_2x-zx-2x\}=\inf_{y_2\in[0,1]}\{y_2(1+x)\}+\inf_{z\ge0}\{-zx\}-2x=\begin{cases}1-x, & \text{if } x<-1,\\ -2x, & \text{if } x\in[-1,0],\\ -\infty, & \text{if } x>0.\end{cases} \]


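Moreover, for any $x\in\Re$, it holds that

\[ \begin{aligned} \inf_{y_2\notin[0,1],\,z\ge0}\{y_2^2+y_2x-zx-2x\}&=\inf_{y_2\notin[0,1]}\{y_2^2+y_2x+x^2/4-x^2/4-2x\}+\inf_{z\ge0}\{-zx\}\\ &=\inf_{y_2\notin[0,1]}\{(y_2+x/2)^2\}+\inf_{z\ge0}\{-zx\}-x^2/4-2x\\ &=\begin{cases}-x^2/4-2x, & \text{if } x<-2,\\ 1-x, & \text{if } x\in[-2,-1],\\ -2x, & \text{if } x\in[-1,0],\\ -\infty, & \text{if } x>0.\end{cases} \end{aligned} \]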


Then by combining the above discussions on the two cases we obtain the conclusion of this lemma. □ 
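The closed form in Lemma 3.1 can be cross-checked numerically. The Python sketch below is only an illustration under stated assumptions: for $x\le0$ it uses the reduction $h(x)=\inf_{y_2}\{\max(y_2,y_2^2)+y_2x-2x\}$ obtained in the proof (taking $z=0$ and letting $y_1\to+\infty$) and approximates the infimum on a finite grid. The function names and the grid are hypothetical choices.

```python
import math


def h_closed_form(x):
    """Dual objective of problem (15) as stated in Lemma 3.1."""
    if x > 0:
        return -math.inf
    if x >= -1:
        return -2.0 * x
    if x >= -2:
        return 1.0 - x
    return -x * x / 4.0 - 2.0 * x


def h_grid(x, half_width=10, steps_per_unit=1000):
    """Grid approximation of inf over y2 of max(y2, y2^2) + y2*x - 2x, for x <= 0."""
    n = half_width * steps_per_unit
    best = math.inf
    for i in range(-n, n + 1):
        y2 = i / steps_per_unit
        best = min(best, max(y2, y2 * y2) + y2 * x - 2.0 * x)
    return best


if __name__ == "__main__":
    for x in (-5.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5, 0.0):
        print(x, h_closed_form(x), h_grid(x))
```

On a grid of dual points the closed form attains its largest value at $x=-4$, in line with the dual solution reported in the text.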


By Lemma 3.1, one can see that the optimal solution to the dual of problem (15) is $\bar x=-4$ and the optimal value of the dual of problem (15) is $h(-4)=4$ (see Fig. 1). Moreover, the set of solutions to the KKT system (11) for problem (15) is given by


\[ \{(y_1,y_2,z,x)\in\Re^4\mid y_1\ge-\log_e 2,\ y_2=2,\ z=0,\ x=-4\}. \]
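The KKT conditions (11) for problem (15) can be checked by hand at points with $y_1>-\log_e 2$, where only the branch $y_2^2$ of $f$ is active and $f$ is differentiable with gradient $(0,2y_2)$. The Python sketch below encodes that check; it is a hypothetical helper limited to this strict case (it does not handle the boundary $y_1=-\log_e 2$, where $\partial f$ is a larger set).

```python
import math


def is_kkt_point(y1, y2, z, x, tol=1e-12):
    """Check (11) for problem (15), assuming y2^2 > exp(-y1) + y2 strictly."""
    if not y2 * y2 > math.exp(-y1) + y2:
        raise ValueError("only the smooth branch y2^2 is handled in this sketch")
    grad_f = (0.0, 2.0 * y2)
    # A = (0, 1)^T, so -A x = (0, -x) must equal the gradient of f.
    stationarity_f = abs(grad_f[0]) <= tol and abs(grad_f[1] + x) <= tol
    # B = -1, so -B x = x must lie in the normal cone of [0, +inf) at z.
    if z > 0:
        stationarity_g = abs(x) <= tol
    elif z == 0:
        stationarity_g = x <= tol
    else:
        stationarity_g = False
    feasible = abs(y2 - z - 2.0) <= tol
    return stationarity_f and stationarity_g and feasible


if __name__ == "__main__":
    print(is_kkt_point(0.0, 2.0, 0.0, -4.0))
    print(is_kkt_point(0.0, 2.0, 0.0, -3.0))
```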
Next, we consider solving problem (15) by using the ADMM scheme (3). For convenience, let $\sigma=1$ and set the initial point $(x^0,y_1^0,y_2^0,z^0)=(0,0,0,0)$. Now, one should compute $(y_1^1,y_2^1)$ by solving

\[ \min_{y_1,y_2}\mathcal{L}_\sigma(y_1,y_2,z^0;x^0). \]

Define the function $\ell(\cdot):\Re\to[-\infty,+\infty]$ by

\[ \begin{aligned} \ell(y_2)&:=\inf_{y_1}\mathcal{L}_\sigma(y_1,y_2,z^0;x^0)\\ &=\inf_{y_1}\left\{\max(e^{-y_1}+y_2,\,y_2^2)+(y_2-2)^2/2\right\}\\ &=\begin{cases}\frac32y_2^2-2y_2+2, & \text{if } y_2\notin[0,1],\\ \frac12y_2^2-y_2+2, & \text{if } y_2\in[0,1].\end{cases} \end{aligned} \]

By direct calculations we can see that the above infimum is attained at $\bar y_2=1$ with $\ell(\bar y_2)=1.5$ (see Fig. 1). However, we have for any $y_1\in\Re$,

\[ \mathcal{L}_\sigma(y_1,1,0;0)=\max(e^{-y_1}+1,\,1)+0.5=e^{-y_1}+1.5>\inf_{y_1,y_2}\mathcal{L}_\sigma(y_1,y_2,z^0;x^0). \]

This means that although $\inf_{y_1,y_2}\mathcal{L}_\sigma(y_1,y_2,z^0;x^0)=1.5$ is finite, it cannot be attained at any $(y_1,y_2)\in\Re^2$. Then the subproblem for computing $(y_1^1,y_2^1)$ is not solvable and hence the ADMM scheme (3) is not well-defined. Note that for problem (15), Assumption 1 fails to hold since the direction $y=(1,0)$ satisfies $\mathcal{A}^*y=0$ and $f0^+(y)=0$ but $f0^+(-y)=+\infty$.
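The non-attainment can also be observed numerically. The Python sketch below evaluates the augmented Lagrangian of problem (15) with $\sigma=1$ at $(z^0,x^0)=(0,0)$; the function names and the grid are assumptions made for illustration, and floating-point arithmetic limits how large $y_1$ can be taken before $e^{-y_1}$ is rounded away.

```python
import math


def aug_lagrangian(y1, y2, z=0.0, x=0.0, sigma=1.0):
    """L_sigma(y1, y2, z; x) for problem (15), finite only for z >= 0."""
    f = max(math.exp(-y1) + y2, y2 * y2)
    r = y2 - z - 2.0  # residual A*y + B*z - c
    return f + x * r + 0.5 * sigma * r * r


def ell(y2):
    """Partial infimum over y1 (approached as y1 -> +inf) at z = 0, x = 0, sigma = 1."""
    return max(y2, y2 * y2) + 0.5 * (y2 - 2.0) ** 2


if __name__ == "__main__":
    # The values approach 1.5 from above along y2 = 1 but never reach it.
    for y1 in (0.0, 5.0, 10.0, 20.0):
        print(y1, aug_lagrangian(y1, 1.0))
    grid = [i / 1000 for i in range(-3000, 3001)]
    print(min(grid, key=ell), min(ell(v) for v in grid))
```

Along $y_2=1$ the objective decreases strictly in $y_1$ toward $1.5$, which is exactly the recession direction $y=(1,0)$ named above.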

Remark 3.1 The counterexample constructed here is very simple. Yet, one may still ask if the objective function $f$ about $(y_1,y_2)$ in problem (15) can be replaced by an even simpler quadratic function. Actually, this is not possible as Assumption 1 holds if $f$ is a quadratic function and the original problem has a solution. Specifically, suppose that $\alpha\in\Re$ is a given number, $\mathcal{Q}:\mathcal{Y}\to\mathcal{Y}$ is a self-adjoint positive semidefinite linear operator and $a\in\mathcal{Y}$ is a given vector while $f$ takes the following form

\[ f(y)=\frac12\langle y,\mathcal{Q}y\rangle+\langle a,y\rangle+\alpha,\quad\forall\,y\in\mathcal{Y}. \]

From [13, Pages 67-68] we know that

\[ f0^+(y)=\begin{cases}\langle a,y\rangle, & \text{if } \mathcal{Q}y=0,\\ +\infty, & \text{if } \mathcal{Q}y\neq0.\end{cases} \tag{16} \]

If problem (1) has a solution, one must have $f0^+(y)\ge0$ whenever $\mathcal{A}^*y=0$. This, together with (16), clearly implies that Assumption 1 holds.
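Formula (16) is easy to evaluate in coordinates. The following Python sketch, with hypothetical helper names and a made-up $2\times2$ example, computes the recession function of a convex quadratic and compares it with the difference quotient from the definition of $\theta0^+$ in Section 2.

```python
import math


def matvec(Q, y):
    return [sum(Q[i][j] * y[j] for j in range(len(y))) for i in range(len(Q))]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def quadratic(Q, a, alpha, y):
    """f(y) = 0.5*<y, Qy> + <a, y> + alpha."""
    return 0.5 * dot(y, matvec(Q, y)) + dot(a, y) + alpha


def recession_quadratic(Q, a, d, tol=1e-12):
    """Formula (16): <a, d> if Qd = 0, and +inf otherwise (Q positive semidefinite)."""
    if any(abs(v) > tol for v in matvec(Q, d)):
        return math.inf
    return dot(a, d)


if __name__ == "__main__":
    Q = [[1.0, 0.0], [0.0, 0.0]]   # assumed example data
    a = [0.0, -1.0]
    print(recession_quadratic(Q, a, [0.0, 1.0]))  # direction in the null space of Q
    print(recession_quadratic(Q, a, [1.0, 0.0]))  # direction with Qd != 0
```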




4 Convergence Properties of sPADMM 

The example presented in the previous section motivates us to consider the convergence of the sPADMM scheme (4) with a computationally more attractive large step-length. We re-emphasize that the sPADMM scheme (4) is a natural yet more useful extension of the ADMM scheme (3) and all the results presented in this section are applicable to the ADMM scheme (3).

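For convenience, we introduce some notations, which will be used throughout this section. We use $\Sigma_f$ and $\Sigma_g$ to denote the two self-adjoint positive semidefinite linear operators whose definitions, corresponding to the two functions $f$ and $g$ in problem (1), can be drawn from (7). Let $(\bar x,\bar y,\bar z)\in\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$ be a given vector, whose definition will be specified later. We denote $x_e:=x-\bar x$, $y_e:=y-\bar y$ and $z_e:=z-\bar z$ for any $(x,y,z)\in\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$. If additionally the sPADMM scheme (4) generates an infinite sequence $\{(x^k,y^k,z^k)\}$, for $k\ge0$ we denote $x_e^k:=x^k-\bar x$, $y_e^k:=y^k-\bar y$ and $z_e^k:=z^k-\bar z$, and define the following auxiliary notations

\[ \begin{cases} u^k:=-\mathcal{A}[x^k+(1-\tau)\sigma(\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k)+\sigma\mathcal{B}^*(z^{k-1}-z^k)]-\mathcal{S}(y^k-y^{k-1}),\\ v^k:=-\mathcal{B}[x^k+(1-\tau)\sigma(\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k)]-\mathcal{T}(z^k-z^{k-1}),\\ \Psi_k:=\frac{1}{\tau\sigma}\|x_e^k\|^2+\|y_e^k\|_{\mathcal{S}}^2+\|z_e^k\|_{\mathcal{T}+\sigma\mathcal{B}\mathcal{B}^*}^2,\\ \Phi_k:=\Psi_k+\|z^k-z^{k-1}\|_{\mathcal{T}}^2+\max(1-\tau,\,1-\tau^{-1})\sigma\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|^2 \end{cases} \tag{17} \]

with the convention $y^{-1}=y^0$ and $z^{-1}=z^0$. Based on these notations, we have the following result.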
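Proposition 4.1 Suppose that $(\bar x,\bar y,\bar z)\in\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$ is a solution to the KKT system (11), and that the sPADMM scheme (4) generates an infinite sequence $\{(x^k,y^k,z^k)\}$ (which is guaranteed to be true if Assumptions 1 and 2 hold, cf. Proposition 2.1). Then, for any $k\ge1$,

\[ u^k\in\partial f(y^k),\quad v^k\in\partial g(z^k), \tag{18} \]

\[ \begin{aligned} \Phi_k-\Phi_{k+1}\ \ge\ &2\|y_e^{k+1}\|_{\Sigma_f}^2+2\|z_e^{k+1}\|_{\Sigma_g}^2+\|y^{k+1}-y^k\|_{\mathcal{S}}^2+\|z^{k+1}-z^k\|_{\mathcal{T}}^2\\ &+\min(1,\,1-\tau+\tau^{-1})\sigma\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1}\|^2\\ &+\min(\tau,\,1+\tau-\tau^2)\sigma\|\mathcal{B}^*(z^{k+1}-z^k)\|^2 \end{aligned} \tag{19} \]

and

\[ \begin{aligned} \Psi_k-\Psi_{k+1}\ \ge\ &2\|y_e^{k+1}\|_{\Sigma_f}^2+2\|z_e^{k+1}\|_{\Sigma_g}^2+\|y^{k+1}-y^k\|_{\mathcal{S}}^2+\|z^{k+1}-z^k\|_{\mathcal{T}}^2\\ &+(1-\tau)\sigma\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1}\|^2+\sigma\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^k\|^2. \end{aligned} \tag{20} \]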


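Proof For any $k\ge1$, the inclusions in (18) directly follow from the first-order optimality condition of the subproblems in the sPADMM scheme (4). The inequality (19) has been proved in Fazel et al. [5, parts (a) and (b) in Theorem B.1]. Meanwhile, by using (B.12) in [5, Theorem B.1] and (5) we can get

\[ \begin{aligned} &\frac{1}{2\tau\sigma}\left(\|x_e^k\|^2-\|x_e^{k+1}\|^2\right)-\frac{\sigma}{2}\|\mathcal{B}^*(z^{k+1}-z^k)\|^2-\frac{\sigma}{2}\|\mathcal{B}^*z_e^{k+1}\|^2+\frac{\sigma}{2}\|\mathcal{B}^*z_e^k\|^2\\ &\quad-\frac{2-\tau}{2}\sigma\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1}\|^2+\sigma\langle\mathcal{B}^*(z^{k+1}-z^k),\,\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1}\rangle\\ &\quad-\frac12\|y_e^{k+1}\|_{\mathcal{S}}^2+\frac12\|y_e^k\|_{\mathcal{S}}^2-\frac12\|z_e^{k+1}\|_{\mathcal{T}}^2+\frac12\|z_e^k\|_{\mathcal{T}}^2\\ &\ge\ \|y_e^{k+1}\|_{\Sigma_f}^2+\|z_e^{k+1}\|_{\Sigma_g}^2+\frac12\|y^{k+1}-y^k\|_{\mathcal{S}}^2+\frac12\|z^{k+1}-z^k\|_{\mathcal{T}}^2, \end{aligned} \]

which, together with the definition of $\Psi_k$ in (17), implies (20). This completes the proof. □

Now, we are ready to present several convergence properties of the sPADMM scheme (4).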


Theorem 4.1 Assume that the solution set to the KKT system (11) for problem (1) is nonempty. Suppose that the sPADMM scheme (4) generates an infinite sequence $\{(x^k,y^k,z^k)\}$, which is guaranteed to be true if Assumptions 1 and 2 hold. Then, if

\[ \tau\in(0,(1+\sqrt5)/2)\quad\text{or}\quad\tau\ge(1+\sqrt5)/2\ \text{ but }\ \sum_{k=0}^{\infty}\|x^{k+1}-x^k\|^2<+\infty, \tag{21} \]

one has the following results:

(a) the sequence $\{x^k\}$ converges to an optimal solution to the dual problem (9), and the primal objective function value sequence $\{f(y^k)+g(z^k)\}$ converges to the optimal value;

(b) the sequences $\{f(y^k)\}$ and $\{g(z^k)\}$ are bounded, and if Assumptions 3 and 4 hold, the sequences $\{y^k\}$ and $\{z^k\}$ are also bounded;

(c) any accumulation point of the sequence $\{(x^k,y^k,z^k)\}$ is a solution to the KKT system (11), and if $(x^\infty,y^\infty,z^\infty)$ is one of its accumulation points, $\mathcal{A}^*y^k\to\mathcal{A}^*y^\infty$, $(\Sigma_f+\mathcal{S})y^k\to(\Sigma_f+\mathcal{S})y^\infty$, $\mathcal{B}^*z^k\to\mathcal{B}^*z^\infty$ and $(\Sigma_g+\mathcal{T})z^k\to(\Sigma_g+\mathcal{T})z^\infty$ as $k\to\infty$;

(d) if $\Sigma_f+\mathcal{A}\mathcal{A}^*+\mathcal{S}\succ0$ and $\Sigma_g+\mathcal{B}\mathcal{B}^*+\mathcal{T}\succ0$, then each of the subproblems in the sPADMM scheme (4) has a unique optimal solution and the whole sequence $\{(x^k,y^k,z^k)\}$ converges to a solution to the KKT system (11).


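Proof Let $(\bar x,\bar y,\bar z)\in\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$ be an arbitrary solution to the KKT system (11) of problem (1). We first establish some basic results and then prove (a) to (d) one by one. In the following, the notations provided at the beginning of this section are used.

Note that $\|\mathcal{A}^*y_e^k\|\le\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|+\|\mathcal{B}^*z_e^k\|$ for any $k\ge0$. Then, if $\tau\in(0,(1+\sqrt5)/2)$, by using (17) and (19) we obtain that the sequences

\[ \{\|x^k\|\},\quad\{\|y^k\|_{\mathcal{S}+\sigma\mathcal{A}\mathcal{A}^*}\}\quad\text{and}\quad\{\|z^k\|_{\mathcal{T}+\sigma\mathcal{B}\mathcal{B}^*}\} \tag{22} \]

are all bounded,

\[ \sum_{k=0}^{\infty}\|y_e^k\|_{\Sigma_f}^2<+\infty,\quad\sum_{k=0}^{\infty}\|z_e^k\|_{\Sigma_g}^2<+\infty,\quad\sum_{k=0}^{\infty}\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|^2<+\infty,\quad\sum_{k=0}^{\infty}\|\mathcal{B}^*(z^{k+1}-z^k)\|^2<+\infty \tag{23} \]

and

\[ \sum_{k=0}^{\infty}\|y^{k+1}-y^k\|_{\mathcal{S}}^2<+\infty,\quad\sum_{k=0}^{\infty}\|z^{k+1}-z^k\|_{\mathcal{T}}^2<+\infty. \tag{24} \]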

If $\tau\ge(1+\sqrt5)/2$ but $\sum_{k=0}^{\infty}\|x^{k+1}-x^k\|^2<+\infty$, by using the equality that $x^{k+1}-x^k=\tau\sigma(\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1})$ we know $\sum_{k=0}^{\infty}\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|^2<+\infty$. Therefore, by using $\|\mathcal{A}^*y_e^k\|\le\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|+\|\mathcal{B}^*z_e^k\|$ and (20) we know that the sequences in (22) are all bounded. Moreover, it holds that

\[ \|\mathcal{B}^*(z^{k+1}-z^k)\|^2\le2\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1}\|^2+2\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^k\|^2, \]

which, together with (20), implies that (23) and (24) hold.

To sum up, we have shown that when (21) holds, the sequences in (22) are bounded and (23) and (24) hold. This, consequently, implies that $\{u^k\}$ and $\{v^k\}$ are bounded. In the following, we prove (a) to (d) separately.

(a) Since $\{x^k\}$ is a bounded sequence, for any one of its accumulation points, e.g. $x^\infty\in\mathcal{X}$, it admits a subsequence, say $\{x^{k_j}\}_{j\ge0}$, such that $\lim_{j\to\infty}x^{k_j}=x^\infty$. By taking limits in the first two equalities of (17) along with $k_j$ for $j\to\infty$ and using (23) and (24), we obtain that

\[ u^\infty:=\lim_{j\to\infty}u^{k_j}=-\mathcal{A}x^\infty\quad\text{and}\quad v^\infty:=\lim_{j\to\infty}v^{k_j}=-\mathcal{B}x^\infty. \tag{25} \]

From (18) and (8) we know that for any $k\ge1$, $y^k\in\partial f^*(u^k)$ and $z^k\in\partial g^*(v^k)$. Hence, we can get $\mathcal{A}^*y^k\in\mathcal{A}^*\partial f^*(u^k)$ and $\mathcal{B}^*z^k\in\mathcal{B}^*\partial g^*(v^k)$ so that

\[ \mathcal{A}^*y^{k_j}+\mathcal{B}^*z^{k_j}\in\mathcal{A}^*\partial f^*(u^{k_j})+\mathcal{B}^*\partial g^*(v^{k_j}),\quad\forall\,j\ge0. \tag{26} \]

Then, by using (23), (24), (25), (26) and the outer semi-continuity of subdifferential mappings of closed proper convex functions we know that

\[ c\in\mathcal{A}^*\partial f^*(-\mathcal{A}x^\infty)+\mathcal{B}^*\partial g^*(-\mathcal{B}x^\infty). \tag{27} \]

This implies that $x^\infty$ is a solution to the dual problem (9). Therefore, we can conclude that any accumulation point of $\{x^k\}$ is a solution to the dual problem (9). To finish the proof of part (a), we need to show that $\{x^k\}$ is a convergent sequence. This will be done in the following.

We first consider the case that $\tau\in(0,(1+\sqrt5)/2)$. Define the sequence $\{\phi_k\}_{k\ge1}$ by

\[ \phi_k:=\|y_e^k\|_{\mathcal{S}}^2+\|z_e^k\|_{\mathcal{T}+\sigma\mathcal{B}\mathcal{B}^*}^2+\|z^k-z^{k-1}\|_{\mathcal{T}}^2+\max(1-\tau,\,1-\tau^{-1})\sigma\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|^2. \]

From (19) in Proposition 4.1 and the fact that $\Phi_k\ge\phi_k$, we know that $\{\phi_k\}$ is a nonnegative and bounded sequence. Thus, there exists a subsequence of $\{\phi_k\}$, say $\{\phi_{k_l}\}$, such that $\lim_{l\to\infty}\phi_{k_l}=\liminf_{k\to\infty}\phi_k$. Since $\{x^{k_l}\}$ is bounded, it must have a convergent subsequence, say $\{x^{k_{l_i}}\}$, such that $\tilde x:=\lim_{i\to\infty}x^{k_{l_i}}$ exists. Note that $(\tilde x,\bar y,\bar z)$ is a solution to the KKT system (11). Therefore, without loss of generality, we can reset $\bar x=\tilde x$ from now on. By using (19) in Proposition 4.1 we know the nonnegative sequence $\{\Phi_k\}$ is monotonically nonincreasing, and

\[ \lim_{k\to\infty}\Phi_k=\lim_{i\to\infty}\Phi_{k_{l_i}}=\lim_{i\to\infty}\left(\frac{1}{\tau\sigma}\|x_e^{k_{l_i}}\|^2+\phi_{k_{l_i}}\right)=\liminf_{k\to\infty}\phi_k. \tag{28} \]

Since $\frac{1}{\tau\sigma}\|x_e^k\|^2=\Phi_k-\phi_k$, we have

\[ \limsup_{k\to\infty}\frac{1}{\tau\sigma}\|x_e^k\|^2=\limsup_{k\to\infty}\{\Phi_k-\phi_k\}\le\limsup_{k\to\infty}\Phi_k-\liminf_{k\to\infty}\phi_k=0, \tag{29} \]

which indicates that $\{x^k\}$ is a convergent sequence.

Second, we need to consider the case that $\tau\ge(1+\sqrt5)/2$. Define the nonnegative sequence $\{\psi_k\}$ by

\[ \psi_k:=\|y_e^k\|_{\mathcal{S}}^2+\|z_e^k\|_{\mathcal{T}+\sigma\mathcal{B}\mathcal{B}^*}^2,\quad\forall\,k\ge0. \]

From (20) we know that

\[ \Psi_k-\Psi_{k+1}\ge(1-\tau)\sigma\|\mathcal{A}^*y_e^{k+1}+\mathcal{B}^*z_e^{k+1}\|^2, \]

which, together with (23), Lemma 2.1 and the fact that $1-\tau<0$, implies that $\{\Psi_k\}$ is a convergent sequence. As a result, by the definition of $\psi_k$ we know the sequence $\{\psi_k\}$ is nonnegative and bounded. Then by choosing proper subsequences of $\{\psi_k\}$ and $\{x^k\}$ and repeating the previous analysis for getting (28) and (29) with $\Phi_k$ and $\phi_k$ being replaced by $\Psi_k$ and $\psi_k$, we can establish that $\lim_{k\to\infty}\Psi_k=\liminf_{k\to\infty}\psi_k$ and $\limsup_{k\to\infty}\frac{1}{\tau\sigma}\|x_e^k\|^2=0$. Hence, $\{x^k\}$ is also a convergent sequence.

Now we study the convergence of the primal objective function value. On the one hand, since $(\bar x,\bar y,\bar z)$ is a saddle point to the Lagrangian function $\mathcal{L}(\cdot)$ defined by (10), we have for any $k\ge1$, $\mathcal{L}(\bar y,\bar z;\bar x)\le\mathcal{L}(y^k,z^k;\bar x)$. This, together with $\mathcal{A}^*\bar y+\mathcal{B}^*\bar z=c$, implies that for any $k\ge1$,

\[ f(\bar y)+g(\bar z)-\langle\bar x,\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\rangle\le f(y^k)+g(z^k). \tag{30} \]

On the other hand, from (18) and (6) we know that

\[ f(y^k)+\langle u^k,\bar y-y^k\rangle\le f(\bar y)\quad\text{and}\quad g(z^k)+\langle v^k,\bar z-z^k\rangle\le g(\bar z). \]

By combining the above two inequalities together and using (17) we can get

\[ \begin{aligned} &f(\bar y)+g(\bar z)-\langle x^k,\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\rangle-\langle\mathcal{S}(y^k-y^{k-1}),y_e^k\rangle\\ &\quad-\langle\mathcal{T}(z^k-z^{k-1}),z_e^k\rangle-\sigma\langle\mathcal{B}^*(z^{k-1}-z^k),\mathcal{A}^*y_e^k\rangle\\ &\quad-(1-\tau)\sigma\|\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k\|^2\ \ge\ f(y^k)+g(z^k). \end{aligned} \tag{31} \]

Since the sequences in (22) are bounded, by using (23), (24) and the fact that any nonnegative summable sequence should converge to zero we know the left-hand sides of both (30) and (31) converge to $f(\bar y)+g(\bar z)$ when $k\to\infty$. Consequently, $\lim_{k\to\infty}\{f(y^k)+g(z^k)\}=f(\bar y)+g(\bar z)$ by the squeeze theorem. Thus, part (a) is proved.

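(b) From (18) we know that for any $k\ge1$,

\[ f(y^k)\le f(\bar y)-\langle u^k,\bar y-y^k\rangle=f(\bar y)-\langle u^k,\bar y\rangle+\langle u^k,y^k\rangle. \tag{32} \]

On the one hand, from the boundedness of $\{u^k\}$ we know that the sequence $\{-\langle u^k,\bar y\rangle\}$ is bounded. On the other hand, from (23), (24) and the boundedness of the sequences in (22), we can use

\[ \begin{aligned} \langle u^k,y^k\rangle=&-\langle x^k,\mathcal{A}^*y^k\rangle-(1-\tau)\sigma\langle\mathcal{A}^*y_e^k+\mathcal{B}^*z_e^k,\mathcal{A}^*y^k\rangle\\ &-\sigma\langle\mathcal{B}^*(z^{k-1}-z^k),\mathcal{A}^*y^k\rangle-\langle\mathcal{S}(y^k-y^{k-1}),y^k\rangle \end{aligned} \]

to get the boundedness of the sequence $\{\langle u^k,y^k\rangle\}$. Hence, from (32) we know the sequence $\{f(y^k)\}$ is bounded from above. From (11) we know

\[ f(y^k)\ge f(\bar y)+\langle-\mathcal{A}\bar x,y^k-\bar y\rangle=f(\bar y)-\langle\bar x,\mathcal{A}^*y_e^k\rangle, \]

which, together with the fact that the sequences in (22) are bounded, implies that $\{f(y^k)\}$ is bounded from below. Consequently, $\{f(y^k)\}$ is a bounded sequence. By using a similar approach, we can obtain that $\{g(z^k)\}$ is also a bounded sequence.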

Next, we prove the remaining part of (b) by contradiction. Suppose that Assumption 3 holds and the sequence $\{y^k\}$ is unbounded. Note that the sequence $\{y^k/(1+\|y^k\|)\}$ is always bounded. Thus it must have a subsequence $\{y^{k_j}/(1+\|y^{k_j}\|)\}_{j\ge0}$, with $\{\|y^{k_j}\|\}$ being unbounded and non-decreasing, converging to a certain point $\xi\in\mathcal{Y}$. From the boundedness of the sequences in (22) we know that $\{\mathcal{A}^*y^k\}$ and $\{\mathcal{S}y^k\}$ are bounded. Then we have

\[ \mathcal{A}^*\xi=\mathcal{A}^*\left(\lim_{j\to\infty}\frac{y^{k_j}}{1+\|y^{k_j}\|}\right)=\lim_{j\to\infty}\frac{\mathcal{A}^*y^{k_j}}{1+\|y^{k_j}\|}=0, \]

and, similarly, $\mathcal{S}\xi=0$. By noting that $\|\xi\|=1$, one has $\xi\in\{y\in\mathcal{Y}\mid y\neq0,\ \mathcal{A}^*y=0,\ \mathcal{S}y=0\}$. On the other hand, define the sequence $\{d^{k_j}\}_{j\ge0}$ by

\[ d^{k_j}:=\left(\frac{y^{k_j}}{1+\|y^{k_j}\|},\ \frac{f(y^{k_j})}{1+\|y^{k_j}\|}\right). \]

From the boundedness of the sequence $\{f(y^{k_j})\}$ and the definition of $\xi$ we know that $\lim_{j\to\infty}d^{k_j}=(\xi,0)$. Since $(y^{k_j},f(y^{k_j}))\in\operatorname{epi}(f)$, by [13, Theorem 8.2] we know that $(\xi,0)$ is a recession direction of $\operatorname{epi}(f)$. Then from the fact that $\operatorname{epi}(f0^+)=0^+(\operatorname{epi} f)$ we know that $f0^+(\xi)\le0$, which contradicts Assumption 3. The boundedness of $\{z^k\}$ under Assumption 4 can be similarly proved. Thus, part (b) is proved.
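(c) Suppose that $(x^\infty,y^\infty,z^\infty)$ is an accumulation point of $\{(x^k,y^k,z^k)\}$. Let $\{(x^{k_j},y^{k_j},z^{k_j})\}_{j\ge0}$ be a subsequence of $\{(x^k,y^k,z^k)\}$ which converges to $(x^\infty,y^\infty,z^\infty)$. By taking limits in (18) along with $k_j$ for $j\to\infty$ and using (17), (23) and (24) we can see that

\[ -\mathcal{A}x^\infty\in\partial f(y^\infty),\quad-\mathcal{B}x^\infty\in\partial g(z^\infty)\quad\text{and}\quad\mathcal{A}^*y^\infty+\mathcal{B}^*z^\infty=c, \tag{33} \]

which implies that $(x^\infty,y^\infty,z^\infty)$ is a solution to the KKT system (11). Now, without loss of generality we reset $(\bar x,\bar y,\bar z)=(x^\infty,y^\infty,z^\infty)$. Then, by part (a) we know that the sequence $\{\Phi_k\}$ defined in (17) converges to zero if $\tau\in(0,(1+\sqrt5)/2)$, and the sequence $\{\Psi_k\}$ defined in (17) converges to zero if $\tau\ge(1+\sqrt5)/2$ but $\sum_{k=0}^{\infty}\|x^{k+1}-x^k\|^2<+\infty$. Thus, we always have

\[ \lim_{k\to\infty}\|y_e^k\|_{\mathcal{S}+\Sigma_f}=0\quad\text{and}\quad\lim_{k\to\infty}\|z_e^k\|_{\mathcal{T}+\sigma\mathcal{B}\mathcal{B}^*+\Sigma_g}=0. \tag{34} \]

As a result, it holds that $\mathcal{B}^*z^k\to\mathcal{B}^*z^\infty$, $(\Sigma_f+\mathcal{S})y^k\to(\Sigma_f+\mathcal{S})y^\infty$ and $(\Sigma_g+\mathcal{T})z^k\to(\Sigma_g+\mathcal{T})z^\infty$ as $k\to\infty$. Moreover, by using the fact that $\mathcal{A}^*y^k=(\mathcal{A}^*y^k+\mathcal{B}^*z^k)-\mathcal{B}^*z^k$ and $\mathcal{A}^*y^k+\mathcal{B}^*z^k\to\mathcal{A}^*y^\infty+\mathcal{B}^*z^\infty=c$ as $k\to\infty$, we can get $\mathcal{A}^*y^k\to\mathcal{A}^*y^\infty$ as $k\to\infty$. This completes the proof of part (c).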

(d) If $\Sigma_f+\mathcal{S}+\mathcal{A}\mathcal{A}^*\succ0$ and $\Sigma_g+\mathcal{T}+\mathcal{B}\mathcal{B}^*\succ0$, the subproblems in the sPADMM scheme (4) are strongly convex, hence each of them has a unique optimal solution. Then, by part (c) we know that $\{y^k\}$ and $\{z^k\}$ are convergent. Note that $\{x^k\}$ is convergent by part (a). Therefore, by part (c) we know that $\{(x^k,y^k,z^k)\}$ converges to a solution to the KKT system (11). Hence, part (d) is proved and this completes the proof of the theorem. □

Before concluding this note, we make the following remarks on the convergence results presented in 
Theorem 4.1. 

Remark 4.1 The corresponding results in part (a) of Theorem 4.1 for the ADMM scheme (3) with $\tau=1$ have been stated in Boyd et al. [1]. However, as indicated by the counterexample constructed in Section 3, the proofs in [1] need to be revised with proper additional assumptions. Actually, no proof on the convergence of $\{x^k\}$ has been given in [1] at all. Nevertheless, one may view the results in part (a) as extensions of those in Boyd et al. [1] for the ADMM scheme (3) with $\tau=1$ to a computationally more attractive sPADMM scheme (4) with a rigorous proof. The condition that $\Sigma_f+\mathcal{A}\mathcal{A}^*+\mathcal{S}\succ0$ and $\Sigma_g+\mathcal{B}\mathcal{B}^*+\mathcal{T}\succ0$ in part (d) was first proposed by Fazel et al. [5].

Remark 4.2 Note that, numerically, the boundedness of the sequences generated by a certain algorithm is a desirable property and Assumptions 3 and 4 can serve this purpose. Assumption 3 is pretty mild in the sense that it holds automatically, even if $\mathcal{S}=0$, for many practical problems where $f$ has bounded level sets. Of course, the same comment can be applied to Assumption 4.

Remark 4.3 The sufficient condition that $\tau\ge(1+\sqrt5)/2$ but $\sum_{k=0}^{\infty}\|x^{k+1}-x^k\|^2<+\infty$ simplifies the condition proposed by Sun et al.^ [15] for the purpose of achieving better numerical performance. The advantage of taking the step-length $\tau\ge(1+\sqrt5)/2$ has been observed in [2,12,15] for solving high-dimensional linear and convex quadratic semi-definite programming problems. In numerical computations, one can start with a larger $\tau$, e.g. $\tau=1.95$, and reset it as $\tau:=\max(\gamma\tau,1.618)$ for some $\gamma\in(0,1)$, e.g. $\gamma=0.95$, if at the $k$-th iteration one observes that […] for some given positive constant $c_0>0$. Since $\tau$ can be reset at most a finite number of times, our convergence analysis is valid for such a strategy. One may refer to [15, Remark 2.3] for more discussions on this computational issue.
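The reset rule $\tau:=\max(\gamma\tau,1.618)$ can be sketched in Python as follows. The observed quantity that triggers a reset is not reproduced here, so the sketch takes the trigger as a caller-supplied sequence of booleans; that interface and the function names are hypothetical.

```python
GOLDEN_FLOOR = 1.618  # floor used in the reset rule of Remark 4.3


def reset_step_length(tau, gamma=0.95, floor=GOLDEN_FLOOR):
    """One application of tau := max(gamma * tau, floor)."""
    return max(gamma * tau, floor)


def step_length_history(tau0, triggers, gamma=0.95):
    """Apply the reset whenever the (user-defined) trigger fires at an iteration."""
    tau = tau0
    history = [tau]
    for fired in triggers:
        if fired:
            tau = reset_step_length(tau, gamma)
        history.append(tau)
    return history


if __name__ == "__main__":
    # Worst case for illustration: the trigger fires at every iteration.
    print(step_length_history(1.95, [True] * 10))
```

Starting from $\tau=1.95$ with $\gamma=0.95$, the value changes only four times before it is pinned at the floor, which illustrates why $\tau$ is reset at most finitely many times.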


5 Conclusions 

In this note, we have constructed a simple example possessing several nice properties to illustrate that the convergence theorem of the ADMM scheme (3) stated in Boyd et al. [1] can be false if no prior condition that guarantees the existence of solutions to all the subproblems involved is made. In order to correct this mistake we have presented fairly mild conditions under which all the subproblems are solvable by using standard knowledge in convex analysis. Based on these conditions, we have further conducted the convergence analysis of the ADMM under a more general and useful sPADMM setting, which has the flexibility of allowing the users to choose proper proximal terms to guarantee the existence of solutions to the subproblems. In particular, we have established some satisfactory convergence properties of the sPADMM with a computationally more attractive large step-length that can exceed the golden ratio of 1.618. In conclusion, this note has (i) clarified some confusion on the convergence results of the popular ADMM; (ii) opened the potential for designing computationally more efficient ADMM-type solvers in the future.

^ The condition that $\tau\ge(1+\sqrt5)/2$ but […] $<+\infty$ was used in [15, Theorem 2.2].